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Abstract. It is known that modeling uncertainties and astrophysical foregrounds can poten- 
tially introduce appreciable bias in the deduced values of cosmological parameters. While it 
is commonly assumed that these uncertainties will be accounted for to a sufficient level of 
precision, the level of bias has not been properly quantified in most cases of interest. We 
show that the requirement that the bias in derived values of cosmological parameters does 
not surpass nominal statistical error, translates into a maximal level of overall error 0(A^~2 ) 
on \AP{k)\/P{k) and |AC/|/C;, where P{k), Ci, and are the matter power spectrum, 
angular power spectrum, and number of (independent Fourier) modes at a given scale I or 
k probed by the cosmological survey, respectively. This required level has important conse- 
quences on the precision with which cosmological parameters are hoped to be determined by 
future surveys: In virtually all ongoing and near future surveys typically falls in the range 
10^ — 10^, implying that the required overall theoretical modeling and numerical precision is 
already very high. Future redshifted-21-cm observations, projected to sample ~ 10^^ modes, 
will require knowledge of the matter power spectrum to a fantastic 10~^ precision level. We 
conclude that realizing the expected potential of future cosmological surveys, which aim at 
detecting 10^ — 10^^ modes, sets the formidable challenge of reducing the overall level of 
uncertainty to 10~^ — 10~^. 
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1 Introduction 

The last two decades marked the beginning of the 'precision cosmology' era when many cos- 
mological surveys are conducted in order to determine about a dozen cosmological parame- 
ters, and test non-standard cosmological models. These cosmological surveys are motivated 
by detailed theoretical predictions, elaborately designed satellite and stratospheirc telescope 
systems, and computationally challenging data analysis and parameter inference procedures. 

It is often stated that the joint effort of a variety of cosmological probes will enable 
breaking some of the vexing degeneracies in cosmological parameter space, but perhaps more 
importantly, it is hoped that cross-correlating the results obtained with a battery of cos- 
mological probes could be used to mitigate inconsistencies due to the different systematics 
affecting each probe. 

In this work we generalize basic arguments, first discussed by Seljak et al. [1] in the 
context of assessing the degree of agreement between different Boltzmann codes employed 
to calculate power spectra of the cosmic microwave background (CMB). We argue that as 
cosmological surveys become more advanced by virtue of higher spectro-spatial resolution 
and sensitivity, larger volume coverage, and lower instrumental noise, the need for a better 
understanding of the theory, higher numerical precision, a model-independent description of 
nonlinear effects, astrophysical foregrounds, and instrumental systematics, are all essential 
for achieving the full benefit of these advanced cosmological probes. The quantitative impli- 
cations and ensuing ramifications of this (quite intuitive) statement are of primary interest 
for this work. 

The challenge stems from the fact that the statistical error in inferred cosmological 
parameters decreases as ~ iV~^/^, where N is the number of independent Fourier modes 
that can be probed by a given experiment. In contrast, the bias (induced by e.g. inaccurate 
theory or model, systematics, and foreground removal) does not decrease with the number 
of modes. The problem then arises with those advanced probes that target a large number 
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of modes where the tension between the two trends becomes sufficiently large to result in an 
unavoidable bias in the inferred cosmological parameters. 

Current cosmological probes access O(IO^) — O(IO^) modes, whereas next generation 
experiments aim at probing over 10^^ modes, most notably redshifted-21-cm observations, 
which allows 1-2 orders of magnitude tighter constraints to be placed on a few cosmologi- 
cal parameters (e.g. [2-5]). It is clear that either an unimaginable precision in theoretical 
modeling has to be achieved, or the number of modes used for parameter estimation must 
be cut at lower angular (2D) or spatial (3D) resolution than previously projected, thereby 
significantly degrading the scientific yield of these probes with respect to standard estimates. 

The outline of the paper is as follows. In section 2 we provide a general discussion 
of biased parameter inference in the framework of Fisher matrix formalism, followed by 
applications to angular and matter power spectra. In section 3 we discuss a few specific 
examples of systematics and modeling uncertainties and highlight their relevance to biased 
parameter inference. Our conclusions are summarized in section 4. 

2 Statistical Error and Bias 

Analyses of cosmological surveys, such as galaxy correlations, galaxy shear, CMB, and su- 
pernovae (SNIe) are commonly based on maximum likelihood methods to obtain the best 
fit cosmological model to the data. To assess bias in cosmological parameter extraction, we 
assume the likelihood function is gaussian in the power spectra. Specifically, this function 
is gaussian in the angular CMB power spectra, in the matter power spectrum of large scale 
structure (LSS) proxies (e.g. galaxy clustering, galaxy lensing, BAO, Lya, 21cm), and in the 
luminosity distance of SNe. While this function is poissonian in number counts of galaxy 
clusters, the numbers are sufficiently large to warrant a gaussian approximation. 

In the framework of standard cosmology, it is clear that its inherent degree of symmetry 
(either the global 2D sky isotropy or 3D spatial homogeneity of the universe) implies that 
the universe is densely sampled on small angular (2D) and physical scales (3D), and thereby 
limited only by the fundamental minimum scale that could be accessed by a specific statistical 
probe with a given experiment (i.e. by the multipole number / and wavenumber k in the 2D 
and 3D cases, respectively, rather than the 2D and 3D wave- vectors I and k). In the ideal case 
of no instrumental noise this implies that the only fundamental limit on precision is the so- 
called 'cosmic- variance-limit' (CVL), i.e. the fact that we observe only a single realization of 
the universe out of an infinitely large number of possible universes characterized by the same 
model (with all models having the exact same variance, i.e. power spectrum, but different 
one-point functions). 

We begin quantitative assessment of statistical error and bias by a brief derivation of 
the well-known expressions for these measures within the Fisher matrix formalism. We then 
apply this formalism specifically to the angular and matter power spectra. 

Without limiting the generality of the discussion we assume the likelihood function is 
gaussian in the data, at least in the large numbers limit. For a quantity d„ for which there 
are rimax data points dn, where n = 1, ...,nmax stands for the multipole number I in the case 
of CMB and galaxy lensing, k and z in the case of large scale proxies, and redshift bin in the 
case of SNe and number counts, the likelihood function can then be written as 
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where 5dn is the statistical uncertainty on (i„ due to, e.g., observational, cosmic variance, and 
shot noise (in the case of galaxy lensing). Let us assume that there are M model parameters, 
A = (Ai, Aa/ ). Taylor expansion of dn around the best fit model, Aq, keeping terms to 
first order, we obtain 



d„(A)«(i„(Ao) + ^-(A- Ao) 



(2.2) 



where we use boldface letters for either vectors or 2D matrices. The likelihood function, 
Eq.(2.1), is then, 



1 f^max 

- J](A-Ao)-F„.(A-Ao) 



n=l 



(2.3) 



C exp 

where for a given n we have defined 

= (W„)2 dXi dX, 

and {F)ij = X]„(F„)ij is the ij element of the familiar Fisher matrix. The statistical error 
on the parameter Aj is 



1 d{dn) d{d„ 



(2.4) 



(2.5) 



Assume that we have a modeling error, or numerical imprecision, and that our theoret- 
ical model shifts by Adn, i.e. dn ^ dn + Adn, that is either unknown or otherwise cannot be 
quantified and accounted for. In the following we consider a single data point for notational 
simplicity, for which the likelihood function is 



exp 



[Ad+^- (X 



Ao)]^ 



2{6dy 



(2.6) 



Solving for the peak of the likelihood function, we obtain the shift, i.e. bias, in the best-fit 
parameters 



Ad 

(5d)2 



d{d) 
~d\' 



(2.7) 



Now, assuming the bias and statistical error are uncorrelated we can add the nominal 
statistical error (Eq. 2.5) and the bias (Eq. 2.7) in quadrature 



-I 



(2. 



Comparing Eqs.(2.5) & (2.7) we see that in order for the model bias Ad to generate a 
significant parameter bias it has to be at the level of (or larger than) the variation of the 
theoretical d over the 'allowed' range of parameter values Ad > other words, 

while statistical error drops with increasing number of modes, any systematic bias in the 
theoretical model does not; therefore, for a given theoretical or modeling accuracy, there is a 
maximal mode beyond which the bias exceeds the statistical error. This implies that even if 
Ad is smaller than d by a very large factor it can still result in a much more significant bias 
in parameter inference if the sensitivity of d to small variations in the cosmological model is 
large. In that case the bias in the data. Ad, can mimic parameter shift, resulting in a biased 
parameter inference. 
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2.1 CMB and Matter Power Spectra 

To illustrate our basic argument in the context of the CMB it is sufficient to consider the 
temperature-only dataset. In the absence of instrumental or any other noise source we have 



6Ci 



1 



{21 + l)fsky 



Cl, 



(2.9) 



which is the cosmic variance. Note that as the sky fraction, fsky, is smaller, sample variance, 
6Ci, increases. In the presence of instrumental noise this expression is modified to 



SQ 



1 



(2/ + l)fsky 



(Ci + Cf 



(2.10) 



where (7"°**"^ is the noise power spectrum. Typically, the detector (instrumental) noise sets 



a natural angular cutoff due to the final beam size, 

(At 



^nmse 



(2.111 



where 9i, = ^J(S\nii)ah is the beam full width at half maximum (FWHM), and Ay charac- 
terizes the detector's white noise level in temperature units. Power at multipoles / > (cTf,)"^ 
is exponentially downweighted. In other words, the experiment is cosmic-variance-limited 
only for Q < 

In this CMB case Eqs.(2.3)-(2.5) yield 



C = exp 



(A - Aq)- 
2al 



where 



sky 



2l + l\ {dCi/dXf 



-1/2 



(2.12) 



(2.13) 



is the standard devitation of the parameter A, i.e. its la statistical uncertainty, and for 
simplicity we define 



(2.14) 



Now, the bias in the parameter A in the presence of unaccounted-for contribution to the 
observed power spectrum, i.e. Cj"^ — >• C/°* -|- AC/, is easily derived from Eq. (2.7). We 
attempt at fitting the observed power spectra with the 'wrong' theoretical model. For exam- 
ple, we fit the primordial CMB 'contaminated' by cluster- or filament-induced temperature 
anisotropy, the thermal Sunyaev-Zeldovich (SZ) effect, with theoretical model that accounts 
for the CMB only, without inclusion of filament-induced power. Doing so will clearly result 
in shifting (biasing) the best-fit A, i.e. Xq ^ Xq + 5\. Of interest is the dimensionless bias, 
i.e. the bias in units of nominal statistical uncertainty, which gauges the bias importance; 
^ > 1 indiactes a relatively large bias, whereas the bias is small if ^ < 1- Following a 
similar procedure to Eqs.(2.6)-(2.8), when there is an unaccounted contribution to the power 
spectrum, the likelihood function is 



C 



exp 



I 



2{5Cif 



(2.15) 
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and the bias in the parameter A is 



Xa^ =Xo(J^ -Jskyl^ly-^jj^t^- (2.16) 



Therefore, the dimensionless bias reads 

- - -fskyax ^ [-^ ) -J^. (2.17) 

Biased cosmological parameter inference from CMB probes have aheady been discussed in 
the context of patchy reionization models (e.g. [6-7]) and residual 'contamination' of the 
CMB sky by undetected galaxy clusters [8-9]. 

It is constructive at this point to consider a toy model that will provide an order of 
magnitude estimate of how large can the bias be. Assume for this toy model only that both 
and AC; are independent of /. In this case, Eqs.(2.13) & (2.17) give 

fsky^2 \~^^^fdC/dX\ 



"max 



Sx _ fsky J AC 

where here Im is the effective maximum /. 

For PLANCK and a CVL experiment, Im ^ 2500-3000 and fsky ~ 1- This imphes that 
for a fixed 6\/a\ « 1 the requirement is that |AC|/C < l/l, and that even if |AC|/C is at 
the 0.1% level, we expect the dimensionless bias to be of order unity, i.e. it starts competing 
with the statistical uncertainty. Clearly, all power spectra in our case are /-dependent and 
this toy model does not apply. Nevertheless, it illustrates that while the large number of 
/-modes decreases the statistical uncertainty (as l^), the bias is hardly affected. As a result, 
the dimensionless bias scales as Im- We note that CMB power spectra have recently been 
sensitively measured at multipoles up to / = 10"* with the South Pole Telescope (SPT) and 
Atacama Cosmology Telescope (ACT) high-resolution ground-based telescopes. 

The impact of bias induced by uncertainties in modeling the evolution of the LSS and its 
properties is determined from analysis of the matter power spectrum, P{k). From Eq.(2.4) 
the Fisher matrix is 

= ^^.AT, (2.19) 



k - dXi dXj 



where [10] 



z^a. / n{z)P{k,z) Y dV 



and kmax and kmin mark the smallest and the largest scale in the survey. The number of 
modes sampled by the probe is set by kmax, and is roughly ~ f^max ^2-1)3 ■ The analog of 
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Eq.(2.18) in this case is 



N 



^ ^ N'/^^. (2.21) 

0"A " 

Combining the results from Eqs.(2.18) & (2-21) we arrive at the basic requirement that 
the fractional error in either the angular or matter power spectrum should satisfy 

< 1 



\AP{k)\ ^ 1 



(2.22) 



Pik) - ViV 

for each I or k, namely that the fractional model error in the power spectra should be smaller 
than the square root of the number of modes. (Note that the total number of modes in a 
CMB experiment scales as ~ Imax^ ^-S- -^l- 2.13.) 

It is a standard practice in the literature to show that the power spectra of systematics, 
foreground residuals, modeling errors, etc., are suppressed to below the cosmic variance level. 
This is warranted by a marginalization procedure over the systematics that results in what is 
presumed to be an unbiased estimate of the cosmological parameters for a relatively low cost 
of (usually) insignificant increase in the statistical uncertainty of the inferred cosmological 
parameters. However, in doing so it is tacitly assumed that the systematics model (or the 
residual systematics model) statistically fluctuates around the exact model; this assumption 
is rarely the case: Significant variation between theoretical models for statistical measures 
of the SZ effect is a relevant illustrative example. Predictions from these models are never 
found to fluctuate around each other. Rather, for virtually any two models for the SZ power 
spectrum (normalized to the same level at a given scale, e.g. I = 3000) one typically flnds that 
one of the models overestimates the power on small scales while the other overestimates it 
on larger scales. In other words, two different SZ models will typically have a different shape 
in multipole space, effectively undermining the basic assumption behind the marginalization 
procedure. If this marginalization path is nevertheless adopted, it would lead to an unrealistic 
level of uncertainty (which is derived in Appendix A), 

^ < 

^ < A'-/^ (2.23) 

bounds that are weaker than the bias-free parameter inference requirements, Eq.(2.22). The 
reason for this is that in deriving Eqs.(2.22) we consider only that part of systematics, 
foregrounds, or modeling errors, that systematically increases or decreases the total power 
spectrum, in contrast to Eq.(2.23) which is obtained by assuming that all these systematics 
contribute power which fluctuates around the exact power spectrum. 

3 CMB and LSS Precision Requirements 

The above general assessment of the bias has important implications for the realistic degree 
of precision that can be attained in CMB (2D) and LSS (3D) probes, as we now demonstrate. 



-6- 



We consider the CMB as a representative for probes that are based on the angular power 
spectrum. Similar probes that will not be discussed here are weak gravitational lensing, and 
redshifted 21-cm analyses based on angular power spectra. 

3.1 CMB Probes 

As previously discussed by Seljak et al. [1], the required numerical precision of CMB Boltz- 
mann codes at a given mode Hs if the numerical errors are uncorrelated, i.e. fluctuating 
in I. If, on the other hand, there is a systematic (i.e. /-correlated) error in the power spectrum 
calculation it must satisfy 

lAGI / I , , 

< - (3.1) 



Ci ~ I 

in order not to bias the parameter inference beyond the statistical error, at that given /. 
The argument is simply that of mode counting; in a mode-annulus of modulus / and width 
Al there are 27rlAl modes (assuming statistical isotropy). The statistical error in estimating 
the angular power spectrum is therefore 5Ci/Ci ~ l/Vi (assuming no mode-correlation). 
However, assuming 6Ci are correlated, i.e. they are systematically lower or higher than the 
real power spectra, the requirement becomes |AQ|/C/ < l/l. This will offset the 5\/a\ oc 
N^/'^ dependence of the bias, where for the CMB - and other angular power spectra, such as 
those used in weak lensing shear maps or 21-cm forecast - the number of modes is N ^ P. 

The primordial CMB power spectrum dies off very quickly beyond / ~ 1000 and as- 
suming a multipole cutoff Imax = 3000 is reasonable. Comparing three Boltzmann codes, 
Seljak et al. [1] find that the 0.1% numerical precision is marginally achieved. Lesgourgues 
[11] illustrated that the CLASS and CAMB codes agree at the 0.01% level assuming the 
same evolutionary history. Recently, recombination modules for the CMB Boltzmann codes 
have been updated to include small corrections that resulted in only ~ 0.1% — 0.2% depar- 
tures between CosmoRec [12-13] and HyRec [14-15], again marginally satisfying the bound, 
Eq.(2.22). 

Achieving the goal of percent-level precision in determining the cosmological parameters 
may not be realistic given the various sources of systematics. It has recently been shown in 
[16] that the beam window function of PLANCK could be calibrated at the ~ 0.1% level 
using a parametric beam model, but this degrades by a factor of a few if a non-parametric 
model is assumed. A systematic error higher than this benchmark will necessarily propagate 
into the recovered power spectra and will ultimately bias the inferred values of cosmological 
parameters. In Appendix B we provide a more quantitative discussion of the precision level 
that beam calibration has to satisfy in order not to violate Eq.(2.22). 

The multifrequency capability of many CMB experiments will enable relatively precise 
removal of most astrophysical foregrounds, due to their non blackbody spectrum. A notable 
exception is the kinematic Sunyaev-Zeldovich (KSZ) effect which is essentially a first order 
Doppler shift of the CMB temperature, and therefore does not alter the blackbody spec- 
trum. The estimated level of the KSZ from patchy reionization models is 1.5 — 3.5/iii'^ [17]. 
This range reflects the theoretical uncertainty in reionization models; it was obtained from 
studying the impact of ~ 100 models on the CMB temperature anisotropy. The impact of 
astrophysical processes, degree of patchiness over the relevant redshift range, and various 
feedback processes, is estimated at the ~ 2 — 3fiK^ level at / = 3000, e.g. [18-21]. Overall, 
this range of variation of the KSZ power due to modeling uncertainty represents 0.1 — 10% 
perturbation to the primordial CMB on the relevant multipole range 1000 < I < 3000. Using 
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the non-gaussianity of the SZ effect to remove this contribution is a reasonable possibihty, but 
we are not aware of any quantitative study that demonstrates that the residual contribution 
to the CMB power spectrum will satisfy Eq.(2.22). 

Taburet et al. [8] have shown that the thermal SZ effect, induced by hot gas in galaxy 
clusters, will significantly bias a few key cosmological parameters, even if the most luminous 
galaxy clusters detected by PLANCK are masked. This is due to the rather significant con- 
tribution made by the undetected clusters to the CMB temperature power spectrum and the 
finite number of frequency bands, instrumental noise, and foregrounds that limit the mass 
and redshift of detectable clusters. While this might not necessarily pose a significant chal- 
lenge to PLANCK science, because cluster masking can benefit from non-CMB surveys that 
are projected to detect ~ O(IO^) clusters, it has recently been shown that warm filamentary 
structures may introduce comparable bias in the inferred cosmological parameters [9]. 

More generally, it remains to be demonstrated that the precision of foreground removal 
techniques can realistically attain the required level implied by Eq.(2.22). In the specific case 
of the thermal SZ effect in clusters, it should be emphasized that calculations of the power 
spectra do not agree at even the few percent level due to the highly model-dependent nature 
of the effect [22-24]. This applies not only to the amplitude but, more import ant antly, to 
the /-dependence of the SZ power spectrum. For example, Millea et al. [25] considered the 
possibility of mitigating the foregrounds-induced bias of inferred cosmological parameters 
by representing the foregrounds by 17 parameters which resulted in only < 20% increase in 
the nominal uncertainty. While this result is encouraging, in order to assess the reliability of 
the inferred cosmological parameters from PLANCK, compelling evidence has to be provided 
that parametrizing the uncertainty in astrophysical foregrounds with only 17 parameters fully 
captures the /-dependence at the required \ACi\/Ci < 1// level. As we show in Appendix C, 
allowing nuisance free parameters in a 'wrong' model does not generally guarantee a bias-free 
cosmological parameter inference. 



3.2 LSS Surveys 

3D surveys that target the matter power spectrum, P{k), clearly probe a larger number of 
modes than the 2D CMB angular power spectrum. The number of modes is ~ f^maa;^^) 
where (as specified in the previous section) V^// and kmax are the effective survey volume 
and maximum wave number, respectively (Eqs. 2.20). For a ~ IGpc survey^ Sjiid kfyidx 
0.1Mpc~^ the total number of modes is ~ 5300, and the required precision on the matter 
power spectrum is \AP{k)\/ P{k) < 1.3%. The SDSS Lumonius Red Galaxy (LRG) survey 
used 42,000 modes [26] in the O.Olh/Mpc < k < O.lh/Mpc range. Reid et al. [27] probed 
deeper into the quasi-linear regime and used data up to /c < 0.2h/Mpc, which resulted in 
increasing the mode number by a factor of ~ 8, thereby significantly improving the constraints 
on the cosmological parameters. All these galaxies are observed at z < 1; how well do we 
know the matter power spectrum at the quasi-linear regime ? Calculation of the evolution of 
P{k; z) can be done perturbatively. Two-loop corrections still contribute at the ~ 1% level 
at quasi-linear scales, e.g. [28]. Going to higher order in regularized perturbation theory 
entails performing integrations at D = 3n — 1 dimensions where n is the loop order, e.g. [29]. 
This is expected to be computationally quite prohibitive if the desired numerical accuracy, 
Eq.(2.22), is to be achieved. For example, on scales where 3rd order perturbation terms are 
non-negligible, 8D integrations should be done that have to be precise at the 10~^ — 10~^ 
level, say, and this will have to be repeated for each parameter set in the multi-dimensional 
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parameter-space search for the best-fit cosmological model. It is unclear whether this is 
realistic. 

An alternative is to employ numerical simulations; carefully choosing initial conditions, 
time steps, sufficient mass resolution and large volumes will allow reaching the rather im- 
pressive 1% accuracy at ~ IMpc [30], but this without including the effect of baryons 
which in itself is expected to contribute at the percent level on these small scales. Also, to 
be useful for Monte-Carlo cosmological parameter search, these calculations of the nonlinear 
matter power spectrum have to be very fast. Running the simulations for only a few cases 
and interpolating between parameter values might result in errors larger than those allowed 
for unbiased cosmological parameters. In addition, a comparison of these simulations with 
CAMB's HALOFIT reveals a 5-10% discrepancy [30]. 

Another possible avenue is to employ Artificial Neural Networks (ANN) for a fast cal- 
culation of the matter power spectrum. Agarwal et al. [31] claim to have reached the < 1% 
precision on scales k < 0.7 hMpc~^ and at redshifts z < 2. However, these neural networks 
have been trained with HALOFIT, which is itself discrepant with [30] at the few percent 
level. 

In parameter estimation forecasts it is customary to adopt the nonlinear-scale cutoff 
at scales where the mass fluctuation inside mass spheres of radius R, is of order unity: 



cr{R) = y ((-^)^)_R ~ 0.5, e.g. [32-33]. While this definition of nonlinear scale is intuitive, 
parameter bias is completely unaccounted for. In fact, it turns out to overly under-estimate 
the impact of our ignorance of the nonlinear matter power spectrum on the bias of in- 
ferred cosmological parameters. In their parameter estimation forecast for post-reionization 
redshifted-21-cm surveys, Visbal, Loeb, & Wyithe [34], determined the largest mode k^ax 
by comparing the nonlinear matter power spectrum from HALOFIT to the linear power 
spectrum at the various redshifts they considered. They defined k^ax at the scale where the 
nonlinear deviates from the linear power spectrum at 10%. They also explored the robust- 
ness of their forecast to varying this criterion in the range 5-25% and indeed their analysis 
shows that the parameter uncertainties degrade as kmax (iii the range where cosmic variance 
dominates over instrumental noise). Since their 21-cm observation is volume-limited, the 
number of modes can be readily calculated, ~ 6.75 x 10^^ k"^^^, where k^ax is in Mpc~^ 
units. Even if we take their most stringent kmax = 0.1Mpc~^ , we obtain that for Eq.(2.22) to 
be satisfied one must know the matter power spectrum to better than one part in 10^. This 
implies that realizing the potential of future post-reionization redshifted-21-cm observations 
seems unlikely. Given the current level of precision in calculations of the matter power spec- 
trum, the number of modes will have to be drastically reduced; this will result in a significant 
weakening the stated scientific yield of these probes, removing their competitive advantage 
over other cosmological probes. 

The tantalizing merits of the pre-reionization 21-cm at very high redshifts, and its 
statistical power to constraining cosmology, have been first advocated by Loeb &: Zaldarriaga 
[35]. One may contemplate that pre-reionization 21-cm at high-redshifts is immune to power 
spectrum non-linearity, which is indeed the case above some redshift-dependent sub-Mpc 
scale. However, use of data down to the baryon Jeans scales comes with the penalty of 
incurring a very large bias; the number of modes in these observations is estimated to fall 
in the range 

10i4 _ 10^6 , which win require knowing the matter power spectrum at the 
~ 10~^ — 10~^ precision. Indeed, it has been shown in [36] that nonlinear corrections to 
the matter power spectrum, even at redshifts as high as 30 or 50, may contribute at the 
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sub-percent level at sub-Mpc scales. As discussed above, while accounting for higher order 
corrections to the matter power spectrum is theoretically possible (even if at the cost of 
significantly slowing down the search in the multidimensional parameter space), it is not 
clear how many such terms should be included in order to reach the fantastic ~ 10~^ — 10~^ 
precision entailed by requiring unbiased parameter estimation from all scales down to the 
baryon Jeans scale. Cutting off the data above some scale kmax larger than the Jeans scale 
kj will degrade the statistical uncertainty by (^m^)3/2^ 

These considerations are especially relevant when assesssing the scientific yields of future 
surveys. For example, Sigurdson & Cooray [37] considered the possibility of delensing the 
polarized CMB sky with high-redshifted 21-cm tracer of gravitational lenses to the level that 
will allow constraining the energy scale of inflation down to V^^'^ ~ 1.1 x 10^^ GeV, equivalent 
to tensor-to-scalar ratio T/S ~ 1.0 x 10"^ with /max ~ 10^. This represents ~ 3 orders of 
magnitude tighter constraint than the ideal CMB experiment. However, reliably inferring 
the energy scale of inflation using this method requires a ~ 10~^ precision of the theoretical 
21-cm anisotropy model, a goal that we believe has not been demonstrated to be realistic. In 
a recent similar work, using /max ~ lO'^, Book, Kamionkowski Sz Schmidt [38] claim that the 
signature of inflationary primordial gravitational waves with as small as T/S ~ 1.0 x 10~^ 
could be detected with future high-redshifted 21-cm observations. This further boosts the 
model precision requirement to one part in 10''. 

It is well appreciated that realizing the potential of the 21-cm probe will be extremely 
challenging given that foregrounds are expected to be ~5 orders of magnitude larger than the 
21-cm signal. In Fisher matrix forecasts of the science yield of these 21-cm observations, it 
is common to model the frequency-dependence and spatial correlations of these foregrounds 
and marginalize over the model free parameters, e.g. [2], [4], [39-42]. These models are 
often extrapolated from other frequency regimes, or are otherwise only partially physically 
motivated, and are often proposed largely due to their functional simplicity. However, for 
unbiased cosmological parameter estimation the marginalization process only makes sense 
when the model functional form, i.e. /c-dependence, faithfully captures the shape of the ob- 
served power spectrum; the extra freedom enabled by adding the nuisance parameters, which 
are subject to marginalization, does not guarantee a bias-free parameter inference. In other 
words, we generally do not expect that using an inadequate model can be fully compensated 
for by simply marginalizing over free nuisance parameters (as argued in Appendix C). 

So far we considered only the systematics in the primordial matter power spectrum. 
In reality, the matter power spectrum is determined from observations of dark matter bi- 
ased tracers, e.g., galaxy clustering, galaxy clusters, and perhaps also from redshifted 21-cm 
neutral gas (that follows dark matter halos) in the future. The observed power spectrum 
is skewed by a scale-dependent multiplicative bias. The observable in galaxy surveys is 
Pg{k; z) = lP'{k, z)P{k; z), e.g. [43]. The calculation of this luminosity-dependent bias is 
highly non-trivial and model-dependent, so one may question its precision. In the case of 
21-cm observations at the reionization era, we assume the surveys to be only volume-limited; 
this eases the bias calculation in the sense that it is only redshift dependent in that case. 
For far- future high-redshift 21-cm observations at high redshifts, there is very little halo bias 
and in that sense this probe is more immune to this systematic source. 

For the bias b{z) calculation one needs to specify the halo mass function. Until only 
recently the mass function of choice was that of Seth &: Tormen [44-45]. This was superseded 
by the Tinker mass function [46-47], but there are several other mass functions. Typically, 
they are discrepant at more than a few percent on the relevant mass-redshift range. In most 
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forthcoming analyses we will only reliably know the redshifts of galaxies or galaxy clusters; 
their mass inference is expected to be very challenging and quite biased. When comparing 
observations to theory a mass function has to be selected in order to average the theoretical 
bias b{M, z) over mass. Also, it is not clear how this average should be performed; is it a 
simple mass average, e.g. [48] ? Is it mass- weighted average over redshift, e.g. [34] ? It 
is therefore clear that uncertainties in mass and in the mass function imply the need for 
arbitrary choices that may easily result in a few percent systematic bias in all fc-modes, 
ultimately biasing the inferred cosmological parameters. 

4 Discussion 

The concordance cosmological model, corroborated by many different cosmological surveys 
that probe the linear and quasi-linear large scale structure, is surprizingly simple on the 
largest scales: Of all concievable universes ours seems to have had initial conditions that 
lead to relatively simple LSS, and with properties that are characterized by only a dozen 
cosmological parameters. It now appears that our most important challenge is precision 
measurements of these parameters. The ability to do so depends largely on the sensitivity 
and extent of future cosmological surveys. More specifically, the level of precision depends 
on the susceptibility of the cosmological model to small variations in the key parameters, the 
number of independent (Fourier) modes of survey data, and on the degeneracy between the 
parameters. 

In reality various factors can limit the quality of our deductions from even the most 
precise measurements; almost unavoidably these include some degree of bias in the inferred 
cosmological parameters. In this work we highlighted the quantitative 'tension' between the 
statistical error that drops as A^~^/^ and the dimensionless bias that typically grows as N^^"^, 
where the number of modes, A^, increases as cosmological experiments become significantly 
more sensitive. Realizing the potential of redshifted 21-cm experiments that will explore 
10® — 10^^ modes requires controlling various systematics, including modeling and simula- 
tion uncertainties and foreground removal, to the level of O(10~'^)-O(10~^). Put differently, 
whereas a model bias of O.OOlfi could be identified with the CMB satellite WMAP (~ 10^ 
modes) at la, a bias as small as 10~^(T would be discerned with future 21 cm observations 
(~ 10^^ modes); a model bias larger than this would therefore be at odds with observation 
and in any case would bias the inferred cosmological parameters. As emphasized above, 
our main concern in this work has been those uncertainties that systematically shift the 
power spectra in an unknown fashion (for which we adopt Eq. (2.22) as the benchmark 
precision criterion). We stress that systematics that fluctuate around the exact model can 
be marginalized over and typically result in mild degradation of the nominal cosmological 
parameter uncertainties without inducing any bias [in which case the required precision is 
summarized in Eq. (2.23)]. However, for this to be the case the model has to exactly capture 
the shape of the power spectra over the entire range of variation of the cosmological param- 
eters. Only then could the model nuisance parameters be marginalized over and ensure a 
bias-free cosmological parameter inference. 

With the steadily increasing number of accessible CMB and LSS modes the statisti- 
cal uncertainty in inferred cosmological parameters is ought to improve. However, even a 
slight bias in the theoretical modeling, data analysis, and foreground removal, can poten- 
tially accrue with increasing number of modes to the level that may possibly bias the best-fit 
cosmological model far beyond the nominal statistical uncertainty. This poses a challenge to 
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next generation LSS probes, especially to redshifted 21 cm observations which are theoreti- 
cally limited only by the baryon Jeans scale, and are poised to ultimately yield measurement 
database covering a huge number of ~ 10^^ modes with unprecedented capability for precise 
cosmological parameter inference. 

The simple rule of thumb that we highlighted in this work, that the statistical error and 
dimensionless bias on the inferred cosmological parameters scale as ~ 

and ~ iVi/2, 

respectively, assumes that the cosmological information is entirely contained in the angular or 
matter power spectrum. However, constraining primordial non-gaussianity requires working 
with the angular (matter) bispectrum, in the 2D (3D) cases. When the main goal is just to set 
observational bound on the degree of non-gaussianity, without specifiying the non-gaussianity 
class, then all triangular configurations in mode-space are allowed, thereby increasing the 
number of modes (for a given observational resolution) used in determining the primordial 
non-gaussianity. This only makes the theoretical precision requirements from the model 
(which is contrasted with the data) more demanding. 

Our objective in this work has been to highlight a major limitation of upcoming cosmo- 
logical surveys, especially those that are expected to greatly benefit from the huge number 
of modes hitherto unprobed by the current lower resolution and higher noise probes. Strictly 
speaking, our simple arguments apply to either a cosmological model with a single free pa- 
rameter, or to multi-parameter model when the parameters are uncorrelated. In practice, this 
is never the case and cosmological parameters do correlate. As the number of usable modes 
increases there is more information in the data to allow breaking the cosmological parameter 
degeneracies. However, as we argue in this work, using a larger number of modes could lead 
to a stronger bias if the model is not sufficiently accurate. To optimize the number of modes 
used in a given cosmological survey the maximal number of modes should be chosen such 
that the statistical error and bias added in quadrature will result in the minimum possible 
error on a given set of parameters. 
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A Precision Requirements: Random Modeling Errors 

We outline the rationale behind the standard requirement on precision of theoretical power 
spectra, pointing out explicitly the assumption behind the derivation of Eq.(2.23). We do 
this for the case of angular power spectrum; extending this derivation to the case of the 
matter power spectrum is straightforward. Analogous to parameter marginalization one can 
account for residual foreground (or other type of modeling uncertainty) by simply modifying 



Eq.(2.15) 
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where we marginalize over the Z-dependent modeling error of the power spectrum (assuming 
that foregrounds and other systematics have already been accounted for) and carry out a 



functional integration over it with a gaussian prior on the power spectrum modeling uncer- 
tainty with being the 1 — a prior on the residual modeling error AC;. For simplicity we 
assume it is symmetrically distributed around 0, since otherwise this itself would introduce 
a bias. Carrying out the integration one obtains that SC^ in Eq.(2.15) is replaced by 

SCf^6C! + a'c,. (A.2) 

Thus, the inferred parameter is unbiased at the cost of increasing variance. As 6Ci decreases 
as we expect the residual SZ power with cr^^ to dominate the denominator (in the above 
sum) at some sufficiently large I, resulting in degraded statistical uncertainty of several cosmo- 
logical parameters. This would typically be perceived as a reasonable cost when compared to 
the bias that would otherwise be introduced. The first relation in Eq. (2.23) can be obtained 
from the following consideration: If 6Cf oc Cf/l then the requirement is acJCi < 
and recalling that the number of modes N ~ I'^ax it follows that bCijCi < A^~^/^. It is 
straightforward to obtain also the second relation in Eqs.(2.23) by employing similar reason- 
ing for the 3D matter power spectrum, P{k), and recalling that here ~ ^max^ ■ ■f'o^ very 
large mode numbers, A^, Eqs.(2.22) & (2.23) give markedly different requirements. In this 
Appendix we show that making the requirement Eq.(2.23) is only warranted under the very 
strong assumption that the residual systematic power spectrum fluctuates around 0. In gen- 
eral, our modeling uncertainty of systematics, foreground residual, etc., is hardly of this type; 
in fact, we rarely know the systematic power spectrum up to a fluctuating part. Relevant 
examples include the statistical thermal SZ effect, the KSZ effect from patchy reionization, 
foreground clustering, and beam systematics. 

B Beam Calibration Requirements 

Features in sky maps on scales smaller than the angular resolution of the telescope are 
effectively smeared by beam dilution. Conventionally, the telescope angular response function 
is modeled as a circular or elliptical gaussian with superimposed low order polynomials. By 
calibrating the beam against a point source such as Mars, Jupiter, or Saturn, the best fit 
model parameters are obtained. 

For simplicity, we model the beam dilution effect as a circular gaussian, but the result 
will hold for more general beam models. The measured power spectrum obtained from the 
map is 

^measured ^real ^—l^a^ ^-g 

where at is the gaussian width. Once the beam is calibrated and ab is obtained the measured 
power spectrum is multiplied by e '^b and the best-fit cosmological model is then obtained 
from comparing theory and observation. In practice, the model chosen for the beam descrip- 
tion might not always capture all beam features, such as sidelobes, higher order moments 
beyond the quadrupole (parameterized by the beam ellipticity), and non-gaussian features, 
etc. Choosing the 'wrong' model might then skew the best-fit procedure. It is conceivable 
that the beam calibration will then result in an effective beam width ab' either larger or 
smaller than ab- In this case, the recovered power spectrum will be 

/^recovered /^measured J? cr^, 
Ci = e b' 



(B.2) 



which is systematicahy larger or smaher than the actual power spectrum (7j"eas?ire(ig«2g.2^ The 
fractional error in the angular power spectrum is therefore 

'r2r 2 2 



Ci 



exp[/^K-(7,^,)]-l- (B.3) 



Assuming the difference at — (Jb' is small, and combining this with Eq.(2.22) results in the 
condition 



2ZV,V < l/l (B.4) 

where we defined the beamwidth missmatch = \(Tb — <^b'\/ (^b- Now, recalling that the 
maximum multipole probed by the telescope is roughly where the beam window function 
drops to a value times its peak value, we can set Imax^b ^ 1 in Eq.(B.4) and obtain 

2^1 < l/lmax (B.5) 

implying that for a beamsize of ~ 5 arcminute (the smallest of the PLANCK/HFI instru- 
ments) the beamwidth has to be calibrated at the 0.1 arcsecond precision, which will be 
challenging given that point sources (such as planets) morphology in the microwave is not 
known to this very high precision level. 



C Marginalization and Residual Bias 

Marginalizing over model uncertainties is a standard practice in analyses/forecasts of cos- 
mological datasets. This technique is especially relevant for addressing residual foregrounds 
or systematics. Accounting for these model uncertainties is conventionally done by parame- 
terizing systematics and foreground models. In most cases these somewhat arbitrary model 
choices are motivated by either mathematical simplicity or well-understood physics extrapo- 
lated to the regime of interest. As mentioned in Section 3 (in our discussion of the CMB and 
21-cm surveys), this procedure may not meet the precision standards required by unbiased 
cosmological parameter inference, as put forward in this work. 

Here we argue that the bias derived in section 2 cannot be simply integrated away 
by assuming a model for the foregrounds and systematics with free nuisance parameters. 
Therefore, although Fisher matrix analyses of the future performance of marginalization 
techniques applied to CMB and 21-cm observations typically result in only a mild increase 
in the statistical error, this procedure by no means alleviates the bias problem. 

Generalizing Eq.(2.6), we denote the model A.dmod for the systematic Ad, so that the 
likelihood function can now be written as 



exp 



[A(i-Aciw + #-(A-Ao)] 
2{5dY 



(C.l) 



We further parameterize Ad^od = ^Ad with A denoting a nuisance model parameter that 
we marginalize over, assuming a gaussian prior, with the dependence on the Fourier mode (/ 
or k) included in Ad 



dAexp 
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where Aq and aA characterize the prior range for the nuisance parameter A. This parameter 
can, for example, be Asz of the recent SPT and ACT parameterization of the amphtude of 
the SZ angular power spectrum. In this case it is unlikely that N-body simulations or analytic 
modeling will result in an SZ angular power spectrum shape Ad identical to the actual Ad 
to the required precision \AC^^\/Ci < 1// up to the maximum I ~ 3000 or 4000 used in 
the SZ analysis. Carrying out the integration over A in Eq.(C.2) in the range [—00,00] is 
partially justified by assuming that it is known to be several a above zero, and partially 
warranted by the fact that if we set the lower integration limit to zero (or a sufficiently small 
value) then the integration over A in an asymmetric range will only weaken the 'power' of 
marginalization, resulting in even a larger bias that what we estimate here. We obtain 



C" = exp 

where 



0^ 1 

^ (C.3) 

2 (W)2+a2(A(i)2 



/3 = Ad-AoAd+^^-(A-Ao). (C.4) 

Comparing this to Eqs.(2.6) we see that the bias and statistical uncertainty changed 

Ad ^ Ad- AoAd 
{6df ^ {5df + al{Adf . (C.5) 

Typically, 5d increases by a factor of order unity, e.g. [4], [25], and while one wants to use a 
model where \Ad — AQAd\ < 1/y/N, this is seldom the case (if at all). In the case of CMB 
and its SZ foreground, different SZ models do not even agree (in amplitude and shape) to 
better than a few tens of percent, far above the \Ad — AoAd\ < 1/lmax requirement. 
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